3D Inverted Index with Cache Sharing for Web Search Engines
Authors
Abstract
Web search engines achieve efficient performance by partitioning and replicating the index data structure used to support query processing. Current practice simply partitions and replicates the text collection across the cluster processors and then constructs an index data structure in each processor. This paper proposes a different approach: constructing an index data structure that explicitly takes into account the fact that the data is partitioned and replicated. This leads to a so-called 3D indexing strategy that outperforms current approaches. Performance is further boosted by introducing an application-level caching scheme devised to hold the most frequently issued queries.
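As a rough illustration of the setting the abstract describes, the sketch below models only the "current practice" baseline: documents are partitioned across processors, each partition is replicated, and every processor builds its own local inverted index. It does not implement the paper's 3D strategy, whose details are not given here. The names `build_index` and `Cluster`, the round-robin partitioning rule, the round-robin replica choice, and the LRU eviction policy (a simple stand-in for a cache of frequently issued queries) are all assumptions made for this example.

```python
from collections import OrderedDict, defaultdict


def build_index(docs):
    """Build a local inverted index: term -> sorted list of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


class Cluster:
    """Toy cluster: partitions x replicas grid of local indexes plus a query cache."""

    def __init__(self, docs, partitions=2, replicas=2, cache_size=2):
        # Assumption: documents are assigned to partitions round-robin by id.
        parts = [dict() for _ in range(partitions)]
        for doc_id, text in docs.items():
            parts[doc_id % partitions][doc_id] = text
        # Each partition's index is built once per replica, mimicking
        # "construct an index in each processor".
        self.grid = [[build_index(p) for _ in range(replicas)] for p in parts]
        self.replicas = replicas
        self.next_replica = 0
        # Assumption: LRU eviction stands in for "most frequently issued".
        self.cache = OrderedDict()
        self.cache_size = cache_size
        self.cache_hits = 0

    def search(self, query):
        """Return sorted ids of documents containing every query term."""
        key = " ".join(sorted(set(query.lower().split())))
        if key in self.cache:
            self.cache.move_to_end(key)
            self.cache_hits += 1
            return self.cache[key]
        # Pick one replica column round-robin, then query every partition in it.
        r = self.next_replica
        self.next_replica = (r + 1) % self.replicas
        results = []
        for partition_replicas in self.grid:
            index = partition_replicas[r]
            postings = [set(index.get(term, ())) for term in key.split()]
            if postings:
                results.extend(set.intersection(*postings))
        results.sort()
        self.cache[key] = results
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used query
        return results
```

In this sketch a query touches one replica of every partition, so replication spreads load across queries while partitioning spreads work within a query; repeated queries are answered from the cache without touching any index.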
Similar resources
Fast Arabic Query Matching for Compressed Arabic Inverted Indices
Information retrieval systems and Web search engines apply highly optimized techniques for compressing inverted indices. These techniques reduce index sizes and improve the performance of query processing that uses compressed indices to find relevant documents for the users' queries. In this paper, we proposed a novel technique for querying compressed Arabic inverted indices in search engines. ...
Inverted indexes: Types and techniques
There has been a substantial amount of research on high-performance inverted indexes because most web and search engines use an inverted index to execute queries. Documents are normally stored as lists of words, but inverted indexes invert this by storing for each word the list of documents that the word appears in, hence the name “inverted index”. This paper presents the crucial research findin...
Distributed search based on self-indexed compressed text
Query response times within a fraction of a second in Web search engines are feasible due to the use of indexing and caching techniques, which are devised for large text collections partitioned and replicated into a set of distributed memory processors. This paper proposes an alternative query processing method for this setting, which is based on a combination of self-indexed compressed text an...
CLIP: A Compact, Load-balancing Index Placement Function
Existing file searching tools do not have the performance or accuracy that search engines have. This is especially a problem in large-scale distributed file systems, where better-performing file searching tools are much needed for enterprise-level systems. Search engines use inverted indices to store terms and other metadata. Although some desktop file searching tools use indices to store file ...
Building a peer-to-peer full-text Web search engine with highly discriminative keys
Web search engines designed on top of peer-to-peer (P2P) overlay networks show promise to enable attractive search scenarios operating at a large scale. However the design of effective indexing techniques for extremely large document collections still raises a number of open technical challenges. Resource sharing, self-organization, and low maintenance costs are favorable properties of P2P over...
Publication date: 2012